Average-Case Quantum Query Complexity* 



Andris Ambainis†
Computer Science Department 
University of California 
Berkeley CA 94720 
USA 

ambainis@cs.berkeley.edu



Ronald de Wolf‡

CWI 
P.O. Box 94079 
1090 GB Amsterdam 
The Netherlands 
rdewolf@cwi.nl



February 1, 2008 



Abstract 

We compare classical and quantum query complexities of total Boolean functions. It is 
known that for worst-case complexity, the gap between quantum and classical can be at most 
polynomial. We show that for average-case complexity under the uniform distribution,
quantum algorithms can be exponentially faster than classical algorithms. Under non-uniform 
distributions the gap can even be super-exponential. We also prove some general bounds for 
average-case complexity and show that the average-case quantum complexity of MAJORITY 
under the uniform distribution is nearly quadratically better than the classical complexity. 



1 Introduction 

The field of quantum computation studies the power of computers based on quantum mechanical 
principles. So far, most quantum algorithms — and all physically implemented ones — have operated 
in the so-called black-box setting. In the black-box model, the input of the function $f$ that we want
to compute can only be accessed by means of queries to a "black box". This returns the $i$th bit of
the input when queried on $i$. The complexity of computing $f$ is measured by the required number
of queries. In this setting we want quantum algorithms that use significantly fewer queries than
the best classical algorithms. Examples of quantum black-box algorithms that are provably better
than any classical algorithm can be found in [?, 25, ?]. Even Shor's quantum algorithm
for period-finding, which is the core of his efficient factoring algorithm [24], can be viewed as a
black-box algorithm [11].

We restrict our attention to computing total Boolean functions $f$ on $N$ variables. The query
complexity of $f$ depends on the kind of errors one allows. For example, we can distinguish between
exact computation, zero-error computation (a.k.a. Las Vegas), and bounded-error computation
(Monte Carlo). In each of these models, worst-case complexity is usually considered: the complexity
is the number of queries required for the "hardest" input. Let $D(f)$, $R(f)$ and $Q(f)$ denote
the worst-case query complexity of computing $f$ for classical deterministic algorithms, classical
randomized bounded-error algorithms, and quantum bounded-error algorithms, respectively. More
precise definitions will be given in the next section. Since quantum bounded-error algorithms are at
least as powerful as classical bounded-error algorithms, and classical bounded-error algorithms are
at least as powerful as deterministic algorithms, we have $Q(f) \le R(f) \le D(f)$. The main quantum
success here is Grover's algorithm [14]. It can compute the OR-function with bounded error using
$\Theta(\sqrt{N})$ queries (which is optimal [?]). Thus $Q(\mathrm{OR}) \in \Theta(\sqrt{N})$, whereas
$D(\mathrm{OR}) = N$ and $R(\mathrm{OR}) \in \Theta(N)$. This is the biggest gap known between quantum
and classical worst-case complexities for total functions. (In contrast, for partial Boolean functions
the gap can be much bigger [12, ?, ?].) In fact, it is known that the gap between $D(f)$ and $Q(f)$ is
at most polynomial for every total $f$: $D(f) \in O(Q(f)^6)$. This is similar to the best known relation
between classical deterministic and randomized algorithms: $D(f) \in O(R(f)^3)$ [?].

*[...] Aspects of Computer Science (STACS'2000), Springer, LNCS 1770, 2000.

†Part of this work was done when visiting Microsoft Research. Supported by Microsoft Research Fellowship and
NSF Grant CCR-9800024.

‡Partially supported by the EU fifth framework project QAIP, IST-1999-11234. Also affiliated with the ILLC,
University of Amsterdam.

Given some probability distribution $\mu$ on the set of inputs $\{0,1\}^N$ one may also consider
average-case complexity instead of worst-case complexity. Average-case complexity concerns the expected
number of queries needed when the input is distributed according to $\mu$. If the hard inputs receive
little $\mu$-probability, then average-case complexity can be significantly smaller than worst-case
complexity. Let $D^\mu(f)$, $R^\mu(f)$, and $Q^\mu(f)$ denote the average-case analogues of $D(f)$, $R(f)$,
and $Q(f)$, respectively, to be defined more precisely in the next section. Again
$Q^\mu(f) \le R^\mu(f) \le D^\mu(f)$. The objective of this paper is to compare these measures and to
investigate the possible gaps between them. Our main results are:

• Under uniform $\mu$, $Q^\mu(f)$ and $R^\mu(f)$ can be super-exponentially smaller than $D^\mu(f)$.

• Under uniform $\mu$, $Q^\mu(f)$ can be exponentially smaller than $R^\mu(f)$. Thus the polynomial
relation that holds between quantum and classical query complexities in the case of worst-case
complexity [?] does not carry over to the average-case setting.

• Under non-uniform $\mu$ the gap can be even larger: we give distributions $\mu$ where $Q^\mu(\mathrm{OR})$ is
constant, whereas $R^\mu(\mathrm{OR})$ is almost $\sqrt{N}$.

• For every $f$ and $\mu$, $R^\mu(f)$ is lower bounded by the expected block sensitivity $E_\mu[bs(f)]$ and
$Q^\mu(f)$ is lower bounded by $E_\mu[\sqrt{bs(f)}]$.

• For the MAJORITY-function under uniform $\mu$, we have that $Q^\mu(f) \in O(\sqrt{N}(\log N)^2)$ and
$Q^\mu(f) \in \Omega(\sqrt{N})$. In contrast, $R^\mu(f) \in \Omega(N)$.

• For the PARITY-function, the gap between $Q^\mu$ and $R^\mu$ can be quadratic, but not more.
Under uniform $\mu$, PARITY has $Q^\mu(f) \in \Omega(N)$.



2 Definitions 

Let $f : \{0,1\}^N \to \{0,1\}$ be a Boolean function. This function is symmetric if $f(X)$ only depends
on $|X|$, the Hamming weight (the number of 1s) of $X$. We will in particular consider the following
symmetric functions: $\mathrm{OR}(X) = 1$ iff $|X| \ge 1$; $\mathrm{MAJ}(X) = 1$ iff $|X| > N/2$;
$\mathrm{PARITY}(X) = 1$ iff $|X|$ is odd. If $X \in \{0,1\}^N$ is an input and $S$ a set of (indices of)
variables, we use $X^S$ to denote the input obtained by flipping the values of the $S$-variables in $X$.
The block sensitivity $bs_X(f)$ of $f$ on an input $X$ is the maximal number $b$ for which there are $b$
disjoint sets of variables $S_1, \ldots, S_b$ such that $f(X) \ne f(X^{S_i})$ for all $1 \le i \le b$.
The block sensitivity $bs(f)$ of $f$ is $\max_X bs_X(f)$.
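To make the definition of $bs_X(f)$ concrete, the following is a minimal brute-force sketch in Python. It is only an illustration for very small $N$ (it enumerates all subsets of variables); the function names `flip` and `block_sensitivity_at` are ours and do not come from the paper.

```python
from itertools import combinations

def flip(x, block):
    """Return the input x (a tuple of bits) with the positions in `block` flipped."""
    return tuple(b ^ 1 if i in block else b for i, b in enumerate(x))

def block_sensitivity_at(f, x):
    """Brute-force bs_X(f): the largest number of pairwise disjoint sensitive blocks at x."""
    n = len(x)
    # All non-empty index sets S with f(x) != f(x^S).
    sensitive = [frozenset(s)
                 for r in range(1, n + 1)
                 for s in combinations(range(n), r)
                 if f(flip(x, s)) != f(x)]

    def best(start, used):
        # Largest family of pairwise disjoint blocks taken from sensitive[start:].
        result = 0
        for j in range(start, len(sensitive)):
            if not (sensitive[j] & used):
                result = max(result, 1 + best(j + 1, used | sensitive[j]))
        return result

    return best(0, frozenset())

def OR(x):
    return int(sum(x) >= 1)

def MAJ(x):
    return int(2 * sum(x) > len(x))

def PARITY(x):
    return sum(x) % 2
```

For example, on $N = 4$ variables this gives block sensitivity 4 for OR at the all-zero input but only 1 at an input of Hamming weight 1, which is the kind of input-dependence that the average-case bounds later exploit.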
We are interested in the question how many bits of the input have to be queried in order to
compute $f$, either for the worst-case or average-case input. We assume familiarity with classical
computation and briefly sketch the definition of quantum query algorithms. For a general
introduction to quantum computing, see the book of Nielsen and Chuang [20]. For more details about
(quantum) query complexity we refer to [10].



An $m$-qubit state is a $2^m$-dimensional unit vector of complex numbers, written
$\sum_{x \in \{0,1\}^m} \alpha_x |x\rangle$. The complex number $\alpha_x$ is called the amplitude of the
basis state $|x\rangle$. A $T$-query quantum algorithm corresponds to a unitary transformation

$$A = U_T O U_{T-1} O \cdots U_1 O U_0.$$

Here the $U_j$ are unitary transformations on $m$ qubits. These $U_j$ are independent of the input. Each
$O$ corresponds to a query to the input $X \in \{0,1\}^N$, formalized as the unitary transformation

$$|i, b, z\rangle \mapsto |i, b \oplus x_i, z\rangle.$$

Here $i \in \{1, \ldots, N\}$, $b \in \{0,1\}$, $\oplus$ is addition modulo 2, and
$z \in \{0,1\}^{m - \log N - 1}$ is the workspace, which remains unaffected by the query. Intuitively, $O$
just gives us the bit $x_i$ when queried on $i$. We will sometimes use the word "oracle" to refer to $X$
as well as to the corresponding $O$. The initial state of the algorithm is the all-zero state
$|0^m\rangle$. The final state is $A|0^m\rangle$, which depends on the input $X$ via the $T$ queries that
are made. A measurement of a dedicated output bit of the final state will yield the output. It can be
shown that this linear-algebraic quantum model is at least as strong as classical randomized
computation: any classical $T$-query randomized algorithm can be simulated by a $T$-query quantum
algorithm having the same error probabilities.

As described above, the quantum algorithm will make exactly $T$ queries on every input $X$. Since
we are interested in the average-case number of queries and the required number of queries will depend
on the input $X$, we need to allow the algorithm to give an output after fewer than $T$ queries. We
will do that by measuring, after each $U_j$, a dedicated flag-qubit of the intermediate state at that
point (this measurement may alter the state). This bit indicates whether the algorithm is already
prepared to stop and output a value. If this bit is 1, then we measure the output bit, output
its value $A(X) \in \{0,1\}$ and stop; if the flag-bit is 0 we let the algorithm continue with the next
query $O$ and $U_{j+1}$. Note that the number of queries that the algorithm makes on input $X$ is now
a random variable, since it depends on the probabilistic outcome of measuring the flag-qubit after
each step. We use $T_A(X)$ to denote the expected number of queries that $A$ makes on input $X$. The
Boolean output $A(X)$ of the algorithm is a random variable as well.

We mainly focus on three kinds of algorithms for computing $f$: classical deterministic, classical
randomized bounded-error, and quantum bounded-error algorithms. Let $\mathcal{D}(f)$ denote the set of
classical deterministic algorithms that compute $f$. Let
$\mathcal{R}(f) = \{\text{classical } A \mid \forall X \in \{0,1\}^N : \Pr[A(X) = f(X)] \ge 2/3\}$
be the set of classical randomized algorithms that compute $f$ with bounded error probability. The
error probability 1/3 is not essential; it can be reduced to any small $\varepsilon$ by running the
algorithm $O(\log(1/\varepsilon))$ times and outputting the majority answer of those runs. Similarly we
let $\mathcal{Q}(f) = \{\text{quantum } A \mid \forall X \in \{0,1\}^N : \Pr[A(X) = f(X)] \ge 2/3\}$ be
the set of bounded-error quantum algorithms for $f$. We define the following worst-case complexities:

$$D(f) = \min_{A \in \mathcal{D}(f)} \max_{X \in \{0,1\}^N} T_A(X)$$

$$R(f) = \min_{A \in \mathcal{R}(f)} \max_{X \in \{0,1\}^N} T_A(X)$$

$$Q(f) = \min_{A \in \mathcal{Q}(f)} \max_{X \in \{0,1\}^N} T_A(X)$$

$D(f)$ is also known as the decision tree complexity of $f$ and $R(f)$ as the bounded-error decision tree
complexity of $f$. Since quantum computation generalizes randomized computation and randomized
computation generalizes deterministic computation, we have $Q(f) \le R(f) \le D(f) \le N$ for all $f$.
The three worst-case complexities are polynomially related: $D(f) \in O(R(f)^3)$ [?] and
$D(f) \in O(Q(f)^6)$ [?] for all total $f$.

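Let $\mu : \{0,1\}^N \to [0,1]$ be a probability distribution. We define the average-case complexity of
an algorithm $A$ with respect to a distribution $\mu$ as:

$$T_A^\mu = \sum_{X \in \{0,1\}^N} \mu(X) T_A(X).$$

The average-case deterministic, randomized, and quantum complexities of $f$ with respect to $\mu$ are

$$D^\mu(f) = \min_{A \in \mathcal{D}(f)} T_A^\mu$$

$$R^\mu(f) = \min_{A \in \mathcal{R}(f)} T_A^\mu$$

$$Q^\mu(f) = \min_{A \in \mathcal{Q}(f)} T_A^\mu$$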

Note that the algorithms still have to satisfy the appropriate output requirements (such as
outputting $f(X)$ with probability $\ge 2/3$ in case of $R^\mu$ or $Q^\mu$) on all inputs $X$, even on $X$
that have $\mu(X) = 0$. Clearly $Q^\mu(f) \le R^\mu(f) \le D^\mu(f) \le N$ for all $\mu$ and $f$. Our goal
is to examine how large the gaps between these measures can be, in particular for the uniform
distribution $unif(X) = 2^{-N}$.

The above treatment of average-case complexity is the standard one used in average-case
analysis of algorithms [26]. One counter-intuitive consequence of these definitions, however, is that the
average-case performance of polynomially related algorithms can be superpolynomially apart (we
will see this happen in Section 5). This seemingly paradoxical effect makes these definitions
unsuitable for dealing with polynomial-time reducibilities and average-case complexity classes, which
is what led Levin to his alternative definition of "polynomial time on average" [?]. Nevertheless,
we feel our definitions are the appropriate ones for our query complexity setting: they are just the
average numbers of queries that one needs when the input is drawn according to distribution $\mu$.
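As a small worked instance of $T_A^\mu$, the sketch below evaluates the average-case and worst-case cost of one particular deterministic algorithm for OR (scan the input from left to right and stop at the first 1). The algorithm and all names (`queries_scan_for_one`, `average_case`, `worst_case`, `uniform`) are our own illustrative choices, and the enumeration over all $2^N$ inputs is only feasible for small $N$.

```python
from fractions import Fraction
from itertools import product

def queries_scan_for_one(x):
    """T_A(X) for the OR-algorithm that reads x left to right and stops at the first 1
    (it reads all N bits on the all-zero input)."""
    for position, bit in enumerate(x, start=1):
        if bit == 1:
            return position
    return len(x)

def average_case(queries, mu, n):
    """T_A^mu = sum over all X in {0,1}^n of mu(X) * T_A(X)."""
    return sum(mu(x) * queries(x) for x in product((0, 1), repeat=n))

def worst_case(queries, n):
    """max over all X in {0,1}^n of T_A(X)."""
    return max(queries(x) for x in product((0, 1), repeat=n))

def uniform(n):
    """The uniform distribution unif(X) = 2^(-n)."""
    return lambda x: Fraction(1, 2 ** n)
```

For $N = 4$ the worst-case cost of this algorithm is 4 queries while its average under the uniform distribution is $15/8$, illustrating how little weight the hard (all-zero) input receives.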



3 Super-Exponential Gap between $D^{unif}(f)$ and $Q^{unif}(f)$

Before comparing the power of classical and quantum computing, we first compare the power of
deterministic and bounded-error algorithms. It is not hard to show that $D^{unif}(f)$ can be much
larger than $R^{unif}(f)$ and $Q^{unif}(f)$:

Theorem 3.1 Define $f$ on $N$ variables such that $f(X) = 1$ iff $|X| \ge N/10$. Then $Q^{unif}(f)$ and
$R^{unif}(f)$ are $O(1)$ and $D^{unif}(f) \in \Omega(N)$.

Proof. Suppose we randomly sample $k$ bits of the input. Let $a = |X|/N$ denote the fraction of
1s in the input and $\tilde{a}$ the fraction of 1s in the sample. The Chernoff bound (see e.g. [?]) implies
that there is a constant $c > 0$ such that

$$\Pr[\tilde{a} < 2/10 \mid a \ge 3/10] \le 2^{-ck}.$$

¹We thank Umesh Vazirani for drawing our attention to this.

Now consider the following randomized algorithm for $f$:

1. Let $i = 100$.

2. Sample $k_i = i/c$ bits. If the fraction $\tilde{a}_i$ of 1s is $\ge 2/10$, then output 1 and stop.

3. If $i < \log N$, then increase $i$ by 1 and repeat step 2.

4. If $i \ge \log N$, then count $|X|$ exactly using $N$ queries and output the correct answer.
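The four steps above can be sketched in Python as follows. This is a hedged illustration rather than a tuned implementation: the Chernoff constant $c$ is not specified in the text, so it appears here as a parameter `c`; the starting index is exposed as `i_start`; sampling is done with replacement; and logarithms are taken to base 2. The function name `threshold_by_sampling` is ours.

```python
import math
import random

def threshold_by_sampling(x, c=1.0, i_start=100, rng=random):
    """Sketch of the sampling algorithm for f(X) = 1 iff |X| >= N/10.

    Returns (output, number_of_queries). `c` stands in for the unspecified
    Chernoff constant and `i_start` for the initial value i = 100.
    """
    n = len(x)
    queries = 0
    i = i_start                                   # step 1
    while True:
        k = math.ceil(i / c)                      # step 2: sample k_i = i/c bits
        sample = [x[rng.randrange(n)] for _ in range(k)]
        queries += k
        if 5 * sum(sample) >= k:                  # fraction of 1s is >= 2/10
            return 1, queries
        if i < math.log2(n):                      # step 3: try a larger sample
            i += 1
        else:                                     # step 4: count |X| exactly
            queries += n
            return int(10 * sum(x) >= n), queries
```

On an input with many 1s the procedure stops after the first sample, so its cost does not grow with $N$; the exact count of step 4 is reached only when every sample looks sparse.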

It is easy to see that this is a bounded-error algorithm for $f$. Let us bound its average-case
complexity under the uniform distribution.

If $a \ge 3/10$, the expected number of queries for step 2 is

$$\sum_{i=100}^{\log N} \Pr[\tilde{a}_{100} < 2/10, \ldots, \tilde{a}_{i-1} < 2/10 \mid a \ge 3/10] \cdot \frac{i}{c}
\le \sum_{i=100}^{\log N} \Pr[\tilde{a}_{i-1} < 2/10 \mid a \ge 3/10] \cdot \frac{i}{c}
\le \sum_{i=100}^{\log N} 2^{-(i-1)} \cdot \frac{i}{c} \in O(1).$$

The probability that step 4 is needed (given $a \ge 3/10$) is at most $2^{-c \log N / c} = 1/N$. This adds
$\frac{1}{N} \cdot N = 1$ to the expected number of queries.

Under the uniform distribution, the probability of the event $a < 3/10$ is at most $2^{-c'N}$ for some
constant $c' > 0$. This case contributes at most $2^{-c'N}(N + (\log N)^2) \in o(1)$ to the expected number
of queries. Thus in total the algorithm uses $O(1)$ queries on average, hence $R^{unif}(f) \in O(1)$. Since
$Q^{unif}(f) \le R^{unif}(f)$ we also have $Q^{unif}(f) \in O(1)$.

Since a deterministic classical algorithm for $f$ must be correct on every input $X$, it is easy to
see that it must make at least $N/10$ queries on every input, hence $D^{unif}(f) \ge N/10$. □



Accordingly, we can have huge gaps between $D^{unif}(f)$ and $Q^{unif}(f)$. However, this example tells
us nothing about the gaps between quantum and classical bounded-error algorithms. In the next
section we exhibit an $f$ where $Q^{unif}(f)$ is exponentially smaller than the classical bounded-error
complexity $R^{unif}(f)$.



4 Exponential Gap between $R^{unif}(f)$ and $Q^{unif}(f)$

4.1 The Function 



We use the following modification of Simon's problem [25]:²

Input: $X = (x_1, \ldots, x_{2^n})$, where each $x_i \in \{0,1\}^n$.

Output: $f(X) = 1$ iff there is a non-zero $k \in \{0,1\}^n$ such that for all $i \in \{0,1\}^n$ we have
$x_{i \oplus k} = x_i$.

Here we treat $i \in \{0,1\}^n$ both as an $n$-bit string and as a number between 1 and $2^n$, and $\oplus$
denotes bitwise XOR. Note that this function is total (unlike Simon's). Formally, $f$ is not a Boolean
function because the variables are $\{0,1\}^n$-valued. However, we can replace every variable $x_i$ by $n$
Boolean variables and then $f$ becomes a Boolean function of $N = n2^n$ variables. The number of
queries needed to compute the Boolean function is at least the number of queries needed to compute
the function with $\{0,1\}^n$-valued variables (because we can simulate a query to the Boolean oracle
by means of a query to the $\{0,1\}^n$-valued input variables, just ignoring the $n - 1$ bits that we are
not interested in) and at most $n$ times the number of queries to the $\{0,1\}^n$-valued oracle (because
one $\{0,1\}^n$-valued query can be simulated using $n$ Boolean queries). As the numbers of queries are
so closely related, it does not make a big difference whether we use the $\{0,1\}^n$-valued oracle or the
Boolean oracle. For simplicity we count queries to the $\{0,1\}^n$-valued oracle.

²The preprint [15] independently proves a related but incomparable result about another Simon-modification.
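The definition of $f$ can be checked by brute force on small instances. The sketch below is only an illustration: it represents each $x_i$ as a Python integer, uses 0-based indices $0, \ldots, 2^n - 1$ as the $n$-bit strings, and reads all $2^n$ values, so it says nothing about query complexity. The name `simon_like_f` is ours.

```python
def simon_like_f(xs):
    """f(X) = 1 iff some non-zero k satisfies xs[i ^ k] == xs[i] for every index i.

    `xs` must have length 2**n; the indices 0 .. 2**n - 1 play the role of n-bit strings
    and `^` is bitwise XOR.
    """
    size = len(xs)
    return int(any(all(xs[i ^ k] == xs[i] for i in range(size))
                   for k in range(1, size)))
```

For instance, with $n = 2$ the input $(5, 5, 7, 7)$ has $f = 1$ (witnessed by $k = 01$), whereas an input with four distinct values has $f = 0$.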

We are interested in the average-case complexity of this function. The main result is the 
following exponential gap, to be proven in the next sections: 

Theorem 4.1 For $f$ as above, $Q^{unif}(f) \le 22n + 1$ and $R^{unif}(f) \in \Omega(2^{n/2})$.



4.2 Quantum Upper Bound 

The quantum algorithm is similar to Simon's. Start with the 2-register superposition
$\sum_{i \in \{0,1\}^n} |i\rangle|0\rangle$ (for convenience we ignore normalizing factors). Apply the
oracle once to obtain

$$\sum_{i \in \{0,1\}^n} |i\rangle|x_i\rangle.$$

Measuring the second register gives some $j$ and collapses the first register to

$$\sum_{i : x_i = j} |i\rangle.$$

A Hadamard transform $H$ maps bits $|b\rangle \mapsto \frac{1}{\sqrt{2}}(|0\rangle + (-1)^b|1\rangle)$.
Applying this to each qubit of the first register gives

$$\sum_{i : x_i = j} \sum_{i' \in \{0,1\}^n} (-1)^{(i,i')} |i'\rangle. \qquad (1)$$

Here $(a, b)$ denotes inner product mod 2; if $(a, b) = 0$ we say $a$ and $b$ are orthogonal.

If $f(X) = 1$, then there is a non-zero $k$ such that $x_i = x_{i \oplus k}$ for all $i$. In particular,
$x_i = j$ iff $x_{i \oplus k} = j$. Then the final state (1) can be rewritten as

$$\sum_{i' \in \{0,1\}^n} \sum_{i : x_i = j} (-1)^{(i,i')} |i'\rangle
= \sum_{i' \in \{0,1\}^n} \left( \sum_{i : x_i = j} \frac{1}{2}\left((-1)^{(i,i')} + (-1)^{(i \oplus k, i')}\right) \right) |i'\rangle
= \sum_{i' \in \{0,1\}^n} \left( \sum_{i : x_i = j} \frac{(-1)^{(i,i')}}{2}\left(1 + (-1)^{(k,i')}\right) \right) |i'\rangle.$$

Notice that $|i'\rangle$ has non-zero amplitude only if $(k, i') = 0$. Hence if $f(X) = 1$, then measuring
the final state gives some $i'$ orthogonal to the unknown $k$.

To decide if $f(X) = 1$, we repeat the above process $m = 22n$ times. Let $i_1, \ldots, i_m \in \{0,1\}^n$ be
the results of the $m$ measurements. If $f(X) = 1$, there must be a non-zero $k$ that is orthogonal to
all $i_r$. Compute the subspace $S \subseteq \{0,1\}^n$ that is generated by $i_1, \ldots, i_m$ (i.e. $S$ is
the set of binary vectors obtained by taking linear combinations of $i_1, \ldots, i_m$ over GF(2)). If
$S = \{0,1\}^n$, then the only $k$ that is orthogonal to all $i_r$ is $k = 0^n$, so then we know that
$f(X) = 0$. If $S \ne \{0,1\}^n$, we just query all $2^n$ values $x_{0 \ldots 0}, \ldots, x_{1 \ldots 1}$ and
then compute $f(X)$. Of course, this latter step is very expensive, but it is needed only rarely:

Lemma 4.2 Assume that $X = (x_{0 \ldots 0}, \ldots, x_{1 \ldots 1})$ is chosen uniformly at random from
$\{0,1\}^N$. Then, with probability at least $1 - 2^{-n}$, $f(X) = 0$ and the measured $i_1, \ldots, i_m$
generate $\{0,1\}^n$.

Proof. It can be shown by a small modification of [?, Theorem 5.1, p. 91] that with probability at
least $1 - 2^{-c2^n}$ ($c > 0$), there are at least $2^n/8$ values $j$ such that $x_i = j$ for exactly one
$i \in \{0,1\}^n$ (and hence $f(X) = 0$). We assume that this is the case in the following.

If $i_1, \ldots, i_m$ generate a proper subspace of $\{0,1\}^n$, then there is a non-zero $k \in \{0,1\}^n$
that is orthogonal to this subspace. We estimate the probability that this happens. Consider some fixed
non-zero vector $k \in \{0,1\}^n$. The probability that $i_1$ and $k$ are orthogonal is at most
$\frac{15}{16}$, as follows. With probability at least 1/8, the measurement of the second register gives
$j$ such that $x_i = j$ for a unique $i$. In this case, the measurement of the final superposition (1)
gives a uniformly random $i'$. The probability that a uniformly random $i'$ has $(k, i') \ne 0$ is 1/2.
Therefore, the probability that $(k, i_1) = 0$ is at most $1 - \frac{1}{8} \cdot \frac{1}{2} = \frac{15}{16}$.

The vectors $i_1, \ldots, i_m$ are chosen independently. Therefore, the probability that $k$ is orthogonal
to each of them is at most $(\frac{15}{16})^m = (\frac{15}{16})^{22n} < 2^{-2n}$. There are $2^n - 1$
possible non-zero $k$, so the probability that there is a $k$ which is orthogonal to each of
$i_1, \ldots, i_m$ is $\le (2^n - 1) 2^{-2n} < 2^{-n}$. □

Note that this algorithm is actually a zero-error algorithm: it always outputs the correct answer.
Its expected number of queries on a uniformly random input is at most $m = 22n$ for generating
$i_1, \ldots, i_m$ and at most $2^{-n} 2^n = 1$ for querying all the $x_i$ if the first step does not give
$i_1, \ldots, i_m$ that generate $\{0,1\}^n$. This completes the proof of the first part of Theorem 4.1. In
contrast, in the appendix we show that the worst-case zero-error quantum complexity of $f$ is
$\Omega(N)$, which is near-maximal.
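The classical post-processing of this algorithm (deciding whether the measured vectors generate all of $\{0,1\}^n$ over GF(2)) and the arithmetic behind the choice $m = 22n$ can be sketched as below. This is an illustrative sketch under our own conventions: vectors are encoded as Python integers used as bitmasks, and the names `gf2_rank`, `spans_everything`, and `failure_bound` are ours.

```python
def gf2_rank(vectors):
    """Rank over GF(2) of bit-vectors given as non-negative Python ints (bitmasks)."""
    basis = {}                       # leading-bit position -> basis vector with that leading bit
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v      # v is independent of the vectors seen so far
                break
            v ^= basis[lead]         # cancel the leading bit and keep reducing
    return len(basis)

def spans_everything(vectors, n):
    """True iff the vectors generate all of {0,1}^n over GF(2)."""
    return gf2_rank(vectors) == n

def failure_bound(n):
    """The union bound (2^n - 1) * (15/16)^(22 n) from the proof of Lemma 4.2."""
    return (2 ** n - 1) * (15 / 16) ** (22 * n)
```

The constant 22 works because $(15/16)^{22} \approx 0.24 < 1/4$, so `failure_bound(n)` stays below $2^{-n}$ for every $n \ge 1$.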

4.3 Classical Lower Bound 

Let $D_1$ be the uniform distribution over all inputs $X \in \{0,1\}^N$ and $D_2$ be the uniform
distribution over all $X$ for which there is a unique $k \ne 0$ such that $x_i = x_{i \oplus k}$ (and hence
$f(X) = 1$). We say an algorithm $A$ distinguishes between $D_1$ and $D_2$ if the average probability that
$A$ outputs 0 is $\ge 2/3$ under $D_1$ and the average probability that $A$ outputs 1 is $\ge 2/3$ under
$D_2$.

Lemma 4.3 If there is a bounded-error algorithm $A$ that computes $f$ with $m = T_A^{unif}$ queries on
average, then there is an algorithm that distinguishes between $D_1$ and $D_2$ and uses $O(m)$ queries
on all inputs.

Proof. Without loss of generality we assume $A$ has error probability $\le 1/10$. To distinguish $D_1$
and $D_2$ we run $A$ until it stops or makes $10m$ queries. If it stops, we output the result of $A$. If it
makes $10m$ queries and has not stopped yet, we output 1.

Under $D_1$, the probability that $A$ outputs 1 is at most $1/10 + o(1)$ ($1/10$ is the maximum
probability of error on an input with $f(X) = 0$ and $o(1)$ is the probability of getting an input with
$f(X) = 1$), so the probability that $A$ outputs 0 is at least $9/10 - o(1)$. The average probability
(under $D_1$) that $A$ does not stop before $10m$ queries is at most $1/10$, for otherwise the average
number of queries would be more than $\frac{1}{10}(10m) = m$. Therefore the probability under $D_1$ that
$A$ outputs 0 after at most $10m$ queries is at least $(9/10 - o(1)) - 1/10 = 4/5 - o(1)$. In contrast, the
$D_2$-probability that $A$ outputs 0 is $\le 1/10$ because $f(X) = 1$ for any input $X$ from $D_2$. This
shows that we can distinguish $D_1$ from $D_2$. □

Lemma 4.4 A classical randomized algorithm $A$ that makes $m \in o(2^{n/2})$ queries cannot distinguish
between $D_1$ and $D_2$.

Proof. For a random input from $D_1$, the probability that all answers to $m$ queries are different is

$$1 \cdot \left(1 - \frac{1}{2^n}\right) \cdots \left(1 - \frac{m-1}{2^n}\right)
\ge 1 - \sum_{i=1}^{m-1} \frac{i}{2^n} = 1 - \frac{m(m-1)}{2^{n+1}} = 1 - o(1).$$

For a random input from $D_2$, the probability that there is an $i$ such that $A$ queries both $x_i$ and
$x_{i \oplus k}$ ($k$ is the hidden vector) is $\le \binom{m}{2}/(2^n - 1) \in o(1)$, since:

1. for every pair of distinct $i, j$, the probability that $i = j \oplus k$ is $1/(2^n - 1)$

2. since $A$ queries only $m$ of the $x_i$, it queries only $\binom{m}{2}$ distinct pairs $i, j$

If no pair $x_i, x_{i \oplus k}$ is queried, the probability that all answers are different is again
$1 - o(1)$.

It is easy to see that all sequences of $m$ different answers are equally likely. Therefore, for both
distributions $D_1$ and $D_2$, we get a uniformly random sequence of $m$ different values with probability
$1 - o(1)$ and something else with probability $o(1)$. Thus $A$ cannot "see" the difference between $D_1$
and $D_2$ with sufficient probability to distinguish between them. □
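The collision estimate used for $D_1$ is the usual birthday-type calculation: $m$ independent uniform $n$-bit answers are pairwise distinct with probability $\prod_{j=0}^{m-1}(1 - j/2^n)$, which is at least $1 - m(m-1)/2^{n+1}$. The following sketch evaluates both quantities exactly for concrete $m$ and $n$; the function names are ours and the code is only a numerical illustration, not part of the proof.

```python
from fractions import Fraction

def prob_all_distinct(m, n):
    """Probability that m independent uniform n-bit answers are pairwise distinct."""
    p = Fraction(1)
    for j in range(m):
        p *= Fraction(2 ** n - j, 2 ** n)
    return p

def union_bound(m, n):
    """The lower bound 1 - m(m-1)/2^(n+1) on the probability above."""
    return 1 - Fraction(m * (m - 1), 2 ** (n + 1))
```

With $n = 20$ and $m = 2^{n/4} = 32$ queries, well below $2^{n/2}$, the bound is already larger than 0.999, so such an algorithm almost always sees $m$ distinct answers.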



The second part of Theorem 4.1 now follows: a classical algorithm that computes $f$ with an
average number of $m$ queries can be used to distinguish between $D_1$ and $D_2$ with $O(m)$ queries
(Lemma 4.3), but then $O(m) \in \Omega(2^{n/2})$ (Lemma 4.4).



5 Super-Exponential Gap for Non-Uniform $\mu$

The last section gave an exponential gap between $Q^\mu$ and $R^\mu$ under uniform $\mu$. Here we show
that the gap can be even larger for non-uniform $\mu$. Consider the average-case complexity of the
OR-function. It is easy to see that $D^{unif}(\mathrm{OR})$, $R^{unif}(\mathrm{OR})$, and
$Q^{unif}(\mathrm{OR})$ are all $O(1)$, since the average input will have many 1s under the uniform
distribution. Now we give some examples of non-uniform distributions $\mu$ where $Q^\mu(\mathrm{OR})$ is
super-exponentially smaller than $R^\mu(\mathrm{OR})$:

Theorem 5.1 If $\alpha \in (0, 1/2)$ and $\mu(X) = c / \left(\binom{N}{|X|} (|X| + 1)^\alpha (N + 1)^{1-\alpha}\right)$
($c \approx 1 - \alpha$ is a normalizing constant), then $R^\mu(\mathrm{OR}) \in \Theta(N^\alpha)$ and
$Q^\mu(\mathrm{OR}) \in \Theta(1)$.

Proof. Any classical algorithm for OR requires $\Theta(N/(|X| + 1))$ queries on an input $X$. The
upper bound follows from random sampling, the lower bound from a block-sensitivity argument [21].
Hence (omitting the intermediate $\Theta$s):

$$R^\mu(\mathrm{OR}) = \sum_X \mu(X) \frac{N}{|X| + 1}
= \sum_{t=0}^{N} \frac{cN}{(t+1)^{\alpha+1} (N+1)^{1-\alpha}} \in \Theta(N^\alpha),$$

where the last step can be shown by approximating the sum over $t$ with an integral. Similarly, for
a quantum algorithm $\Theta(\sqrt{N/(|X| + 1)})$ queries are necessary and sufficient on an input $X$ [?],
so

$$Q^\mu(\mathrm{OR}) = \sum_X \mu(X) \sqrt{\frac{N}{|X| + 1}}
= \sum_{t=0}^{N} \frac{c\sqrt{N}}{(t+1)^{\alpha+1/2} (N+1)^{1-\alpha}} \in \Theta(1).$$

□

In particular, for $\alpha = 1/2 - \varepsilon$ we have the very large gap of $O(1)$ quantum versus
$\Omega(N^{1/2-\varepsilon})$ classical. Note that we obtain this super-exponential gap by weighing the
complexity of two algorithms (classical and quantum OR-algorithms) which are only quadratically apart on
each input $X$. This is the phenomenon we referred to at the end of Section 2.
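The two sums in the proof of Theorem 5.1 can be evaluated numerically. The sketch below is a hedged illustration: it takes the per-input costs to be exactly $N/(|X|+1)$ and $\sqrt{N/(|X|+1)}$, i.e. it drops the constants hidden in the $\Theta$s, and it groups inputs by Hamming weight $t$ so that the binomial coefficient cancels. The function name `or_averages` is ours.

```python
import math

def or_averages(n, alpha):
    """Return (c, classical_sum, quantum_sum) for the distribution of Theorem 5.1.

    The total mu-mass of Hamming weight t is c * w[t] with
    w[t] = 1 / ((t+1)^alpha * (n+1)^(1-alpha)); c is chosen so the masses sum to 1.
    Hidden Theta-constants of the per-input costs are dropped (an assumption).
    """
    w = [1.0 / ((t + 1) ** alpha * (n + 1) ** (1 - alpha)) for t in range(n + 1)]
    c = 1.0 / sum(w)
    classical = sum(c * w[t] * n / (t + 1) for t in range(n + 1))
    quantum = sum(c * w[t] * math.sqrt(n / (t + 1)) for t in range(n + 1))
    return c, classical, quantum
```

For $\alpha = 1/4$ the normalizing constant comes out close to $1 - \alpha = 0.75$, the classical sum grows roughly like $N^{1/4}$ as $N$ increases, and the quantum sum stays bounded by a small constant.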

6 General Bounds for Average-Case Complexity 

In this section we prove some general bounds. First we make precise the intuitively obvious fact
that if an algorithm $A$ is faster on every input than another algorithm $B$, then it is also faster on
average under any distribution:

Theorem 6.1 If $\phi : \mathbb{R} \to \mathbb{R}$ is a concave function and $T_A(X) \le \phi(T_B(X))$ for
all $X$, then $T_A^\mu \le \phi(T_B^\mu)$ for every $\mu$.

Proof. By Jensen's inequality, if $\phi$ is concave then $E_\mu[\phi(T)] \le \phi(E_\mu[T])$, hence

$$T_A^\mu = \sum_{X \in \{0,1\}^N} \mu(X) T_A(X) \le \sum_{X \in \{0,1\}^N} \mu(X) \phi(T_B(X))
\le \phi\left(\sum_{X \in \{0,1\}^N} \mu(X) T_B(X)\right) = \phi(T_B^\mu).$$

□

In words: taking the average cannot make the complexity-gap between two algorithms smaller.
For instance, if $T_A(X) \le \sqrt{T_B(X)}$ (say, $A$ is Grover's algorithm and $B$ is a classical
algorithm for OR), then $T_A^\mu \le \sqrt{T_B^\mu}$. On the other hand, taking the average can make the
gap much larger, as we saw in Theorem 5.1: the quantum algorithm for OR runs only quadratically faster
than any classical algorithm on each input, but the average-case gap between quantum and classical can be
much bigger than quadratic.

We now prove a general lower bound on $R^\mu$ and $Q^\mu$. The classical case of the following lemma
was shown in [21], the quantum case in [?].

Lemma 6.2 Let $A$ be a bounded-error algorithm for some function $f$. If $A$ is classical then
$T_A(X) \in \Omega(bs_X(f))$, and if $A$ is quantum then $T_A(X) \in \Omega(\sqrt{bs_X(f)})$.

A lower bound in terms of the $\mu$-expected block sensitivity follows:

Theorem 6.3 For all $f$, $\mu$: $R^\mu(f) \in \Omega(E_\mu[bs_X(f)])$ and
$Q^\mu(f) \in \Omega(E_\mu[\sqrt{bs_X(f)}])$.

7 Average-Case Complexity of MAJORITY 

Here we examine the average-case complexity of the MAJORITY-function. The hard inputs for
majority occur when $t = |X| \approx N/2$. Any quantum algorithm needs $\Omega(N)$ queries for such
inputs [?]. Since the uniform distribution puts most probability on the set of $X$ with $|X|$ close
to $N/2$, we might expect an $\Omega(N)$ average-case complexity as well. However, we will prove that
the complexity is nearly $\sqrt{N}$. For this we need the following result about approximate quantum
counting, which is Theorem 13 of [?] (this is the upcoming journal version of [?] and [?]; see
also [?, Theorem 1.10]):

Theorem 7.1 (Brassard, Høyer, Mosca, Tapp) There exists a quantum algorithm QCount
with the following property. For every $N$-bit input $X$ (with $t = |X|$) and number of queries $T$, and
any integer $k \ge 1$, QCount uses $T$ queries and outputs a number $\tilde{t}$ such that

$$|t - \tilde{t}| \le 2\pi k \frac{\sqrt{t(N - t)}}{T} + \pi^2 k^2 \frac{N}{T^2}$$

with probability at least $8/\pi^2$ if $k = 1$ and probability $\ge 1 - 1/(2(k - 1))$ if $k > 1$.

Using repeated applications of this quantum counting routine we can obtain a quantum
algorithm for majority that is fast on average:

Theorem 7.2 $Q^{unif}(\mathrm{MAJ}) \in O(\sqrt{N}(\log N)^2)$.

Proof. For all $i \in \{1, \ldots, \log N\}$, define
$A_i = \{X \mid N/2^{i+1} < ||X| - N/2| \le N/2^i\}$. The probability under the uniform distribution of
getting an input $X \in A_i$ is $\mu(A_i) \in O(\sqrt{N}/2^i)$, since the number of inputs $X$ with $k$ 1s
is $\binom{N}{k} \in O(2^N/\sqrt{N})$ for all $k$. The idea of our algorithm is to have $\log N$ runs of the
quantum counting algorithm, with increasing numbers of queries, such that the majority value of inputs
from $A_i$ is probably detected around the $i$th counting stage. We will use $T_i = 100 \cdot 2^i \log N$
queries in the $i$th counting stage. Our MAJORITY-algorithm is the following:

For $i = 1$ to $\log N$ do:

quantum count $|X|$ using $T_i$ queries (call the estimate $\tilde{t}_i$)

if $|\tilde{t}_i - N/2| > N/2^i$, then output whether $\tilde{t}_i > N/2$ and stop.

Classically count $|X|$ using $N$ queries and output its majority.

Let us analyze the behavior of the algorithm on an input $X \in A_i$. For $t = |X|$, we have
$|t - N/2| \in (N/2^{i+1}, N/2^i]$. By Theorem 7.1, with probability $> 1 - 1/(10 \log N)$ we have
$|\tilde{t}_i - t| \le N/2^i$, so with probability $(1 - 1/(10 \log N))^{\log N} \approx e^{-1/10} > 0.9$ we
have $|\tilde{t}_i - t| \le N/2^i$ for all $1 \le i \le \log N$. This ensures that the algorithm outputs the
correct value with high probability.

We now bound the expected number of queries the algorithm needs on input $X$. Consider the
$(i + 2)$nd counting stage. With probability $1 - 1/(10 \log N)$ we will have
$|\tilde{t}_{i+2} - t| \le N/2^{i+2}$. In this case the algorithm will terminate, because

$$|\tilde{t}_{i+2} - N/2| \ge |t - N/2| - |\tilde{t}_{i+2} - t| > N/2^{i+1} - N/2^{i+2} = N/2^{i+2}.$$

Thus with high probability the algorithm needs no more than $i + 2$ counting stages on input $X$.
Later counting stages take exponentially more queries ($T_{i+2+j} = 2^j T_{i+2}$), but are needed only with
exponentially decreasing probability $O(1/(2^j \log N))$: the probability that
$|\tilde{t}_{i+2+j} - t| > N/2^{i+2}$ goes down exponentially with $j$ precisely because the number of
queries goes up exponentially. Similarly, the last step of the algorithm (classical counting) is needed
only with negligible probability.

Now the expected number of queries on input $X$ can be upper bounded by

$$\sum_{j=1}^{i+2} T_j + \sum_{k=i+3}^{\log N} T_k \cdot O\left(\frac{1}{2^{k-i-2} \log N}\right)
< 100 \cdot 2^{i+3} \log N + \sum_{k=i+3}^{\log N} O(2^i) \in O(2^i \log N).$$

Therefore under the uniform distribution the average expected number of queries can be upper
bounded by $\sum_{i=1}^{\log N} \mu(A_i) O(2^i \log N) \in O(\sqrt{N}(\log N)^2)$. □

The nearly matching lower bound is:

Theorem 7.3 $Q^{unif}(\mathrm{MAJ}) \in \Omega(\sqrt{N})$.

Proof. Let $A$ be a bounded-error quantum algorithm for MAJORITY. It follows from the
worst-case results of [?] that $A$ uses $\Omega(N)$ queries on the hardest inputs, which are the $X$ with
$|X| = N/2 \pm 1$. Since the uniform distribution puts $\Omega(1/\sqrt{N})$ probability on the set of such
$X$, the average-case complexity of $A$ is at least $\Omega(1/\sqrt{N}) \cdot \Omega(N) = \Omega(\sqrt{N})$. □
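The two probability estimates behind Theorems 7.2 and 7.3 can be checked numerically for a concrete $N$. The sketch below is only an illustration under our own conventions: it assumes $N$ is a power of two, takes logarithms to base 2, drops the constant 100 from $T_i$, and does not simulate quantum counting at all; it merely evaluates $\mu(A_i)$ and the weight of the middle layer exactly. All function names are ours.

```python
import math
from fractions import Fraction

def layer_mass(n, i):
    """mu(A_i) under the uniform distribution, where
    A_i = {X : n/2^(i+1) < | |X| - n/2 | <= n/2^i}."""
    lo, hi = Fraction(n, 2 ** (i + 1)), Fraction(n, 2 ** i)
    count = sum(math.comb(n, t) for t in range(n + 1)
                if lo < abs(Fraction(2 * t - n, 2)) <= hi)
    return Fraction(count, 2 ** n)

def weighted_stage_cost(n):
    """sum_i mu(A_i) * 2^i * log n, the quantity bounded at the end of the
    proof of Theorem 7.2 (with constants dropped)."""
    log_n = int(math.log2(n))
    return sum(float(layer_mass(n, i)) * 2 ** i * log_n
               for i in range(1, log_n + 1))

def central_mass(n):
    """Probability that a uniform n-bit input has Hamming weight exactly n/2 (n even)."""
    return math.comb(n, n // 2) / 2 ** n
```

For $N = 256$ the weighted cost stays below $\sqrt{N}(\log N)^2 = 1024$, and the middle layer has probability about $0.8/\sqrt{N}$, in line with the $\Omega(1/\sqrt{N})$ weight used in the lower bound.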

What about the classical average-case complexity of MAJORITY? Alonso, Reingold, and
Schott [?] prove the bound $D^{unif}(\mathrm{MAJ}) = 2N/3 - \sqrt{8N/9\pi} + O(\log N)$ for deterministic
classical computers. We can also prove a linear lower bound for the bounded-error classical complexity,
using the following lemma:

Lemma 7.4 Let $\Delta \in \{1, \ldots, \sqrt{N}\}$. Any classical bounded-error algorithm that computes
MAJORITY on inputs $X$ with $|X| \in \{N/2, N/2 + \Delta\}$ must make $\Omega(N)$ queries on all such inputs.

Proof. We will prove the lemma for $\Delta = \sqrt{N}$, which is the hardest case. We assume without
loss of generality that the algorithm queries its input $X$ at $T(X)$ random positions, and outputs 1
if the fraction of 1s in its sample is at least $(N/2 + \Delta)/N = 1/2 + 1/\sqrt{N}$. We do not care what
the algorithm outputs otherwise. Consider an input $X$ with $|X| = N/2$. The algorithm uses $T = T(X)$
queries and should output 0 with probability at least 2/3. Thus the probability of output 1 on $X$
must be at most 1/3, in particular

$$\Pr[\text{at least } T(1/2 + 1/\sqrt{N}) \text{ 1s in sample of size } T] \le 1/3.$$

Since the $T$ queries of the algorithm can be viewed as sampling without replacement from a set
containing $N/2$ 1s and $N/2$ 0s, this error probability is given by the hypergeometric distribution

$$\Pr[\text{at least } T(1/2 + 1/\sqrt{N}) \text{ 1s in sample of size } T]
= \sum_{i = T(1/2 + 1/\sqrt{N})}^{T} \frac{\binom{N/2}{i} \binom{N/2}{T-i}}{\binom{N}{T}}.$$

We can approximate the hypergeometric distribution using the normal distribution, see e.g. [19].
Let $z_k = (2k - T)/\sqrt{T}$ and $\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\,dt$, then the
above probability approaches

$$\Phi(z_T) - \Phi(z_{T(1/2 + 1/\sqrt{N})}).$$

Note that $\Phi(z_T) = \Phi(\sqrt{T}) \to 1$ and that
$\Phi(z_{T(1/2 + 1/\sqrt{N})}) = \Phi(2\sqrt{T/N}) \to 1/2$ if $T \in o(N)$. Thus we can only avoid having an
error probability close to 1/2 by using $T \in \Omega(N)$ queries on $X$ with $|X| = N/2$. A similar argument
shows that we must also use $\Omega(N)$ queries if $|X| = N/2 + \Delta$. □
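The hypergeometric tail in this proof can be evaluated exactly for small parameters, which makes the dependence on $T$ visible. The sketch below is an illustration under simplifying assumptions of ours: $N$ is an even perfect square (so that $1/\sqrt{N}$ is rational and the threshold can be computed without floating point), and the threshold $T(1/2 + 1/\sqrt{N})$ is rounded up to an integer. The function name `error_probability` is ours.

```python
import math
from fractions import Fraction

def error_probability(n, t):
    """Pr[at least t(1/2 + 1/sqrt(n)) ones in a sample of size t drawn without
    replacement from n/2 ones and n/2 zeros], as an exact hypergeometric tail.

    Assumes n is an even perfect square.
    """
    r = math.isqrt(n)
    if r * r != n or n % 2:
        raise ValueError("n must be an even perfect square")
    half = n // 2
    # ceil(t * (1/2 + 1/r)) = ceil(t * (r + 2) / (2 r)), in integer arithmetic
    threshold = -(-t * (r + 2) // (2 * r))
    favourable = sum(math.comb(half, i) * math.comb(half, t - i)
                     for i in range(threshold, t + 1))
    return Fraction(favourable, math.comb(n, t))
```

For $N = 400$ a sample of size $T = 20$ gives an error probability above 1/3, whereas a sample covering most of the input ($T = 360$) brings it far below 1/3, consistent with the need for $T \in \Omega(N)$.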

It now follows that: 
Theorem 7.5 $R^{unif}(\mathrm{MAJ}) \in \Omega(N)$.

Proof. The previous lemma shows that any algorithm for MAJORITY needs $\Omega(N)$ queries on
inputs $X$ with $|X| \in [N/2, N/2 + \sqrt{N}]$. Since the uniform distribution puts $\Omega(1)$ probability on
the set of such $X$, the theorem follows. □

Accordingly, on average a quantum computer can compute MAJORITY almost quadratically 
faster than a classical computer, whereas for the worst-case input quantum and classical computers 
are about equally fast (or slow). 
8 Average-Case Complexity of PARITY

Finally we prove some results for the average-case complexity of PARITY. This is in many ways
the hardest Boolean function. Firstly, $bs_X(f) = N$ for all $X$, hence by Theorem 6.3:

Corollary 8.1 For every $\mu$, $R^\mu(\mathrm{PARITY}) \in \Omega(N)$ and
$Q^\mu(\mathrm{PARITY}) \in \Omega(\sqrt{N})$.

With high probability we can obtain an exact count of $|X|$, using $O(\sqrt{(|X| + 1)N})$ quantum
queries [?]. Combining this with a $\mu$ that puts $O(1/\sqrt{N})$ probability on the set of all $X$ with
$|X| > 1$ and distributes the remaining probability arbitrarily over the $X$ with $|X| \le 1$, we obtain
a distribution $\mu$ such that $Q^\mu(\mathrm{PARITY}) \in O(\sqrt{N})$.

We can prove $Q^\mu(\mathrm{PARITY}) \le N/6$ for any $\mu$ by the following algorithm: with probability 1/3
output 1, with probability 1/3 output 0, and with probability 1/3 run the exact quantum algorithm
for PARITY, which has worst-case complexity $N/2$ [13]. This algorithm has success probability
2/3 on every input and has expected number of queries equal to $N/6$.

More than a linear speed-up on average is not possible if $\mu$ is uniform:

Theorem 8.2 $Q^{unif}(\mathrm{PARITY}) \in \Omega(N)$.

Proof. Let $A$ be a bounded-error quantum algorithm for PARITY. Let $B$ be an algorithm that
flips each bit of its input $X$ with probability 1/2, records the number $b$ of actual bitflips, runs $A$
on the changed input $Y$, and outputs $A(Y) + b \bmod 2$. It is easy to see that $B$ is a bounded-error
algorithm for PARITY and that it uses an expected number of $T_A^{unif}$ queries on every input. Using
standard techniques, we can turn this into an algorithm for PARITY with worst-case $O(T_A^{unif})$ queries.
Since the worst-case lower bound for PARITY is $N/2$ [?, 13], the theorem follows. □
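The randomizing wrapper $B$ from this proof, and the accounting for the $N/6$ algorithm above, can be sketched as follows. This is a classical stand-in only: `parity_algorithm` is a hypothetical callable playing the role of $A$ (in the paper $A$ is a quantum algorithm), and the names `randomized_reduction` and `mixed_strategy` are ours.

```python
import random
from fractions import Fraction

def randomized_reduction(parity_algorithm, x, rng=random):
    """Algorithm B: run A on a uniformly random input Y derived from X and
    correct its answer by the parity of the number of flipped bits."""
    flips = [rng.randrange(2) for _ in x]        # flip each bit with probability 1/2
    y = [xi ^ fi for xi, fi in zip(x, flips)]    # Y is uniformly distributed
    b = sum(flips)                               # number of actual bit flips
    return (parity_algorithm(y) + b) % 2

def mixed_strategy(n):
    """Expected queries and success probability of: output 1 w.p. 1/3, output 0
    w.p. 1/3, run an exact n/2-query PARITY algorithm w.p. 1/3."""
    expected_queries = Fraction(1, 3) * Fraction(n, 2)
    success = Fraction(1, 3) + Fraction(1, 3)    # the correct constant guess, or the exact run
    return expected_queries, success
```

Because $Y$ is uniformly distributed whatever $X$ is, the cost of $B$ on every input equals the uniform average cost of $A$; the correction by $b$ works since $\mathrm{PARITY}(Y) = \mathrm{PARITY}(X) + b \bmod 2$.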
A Worst-case Complexity of $f$

In this appendix we will show a lower bound of $\Omega(N)$ queries for the zero-error worst-case complexity
$Q_0(f)$ of the function $f$ on $N = n2^n$ binary variables defined in Section 4. (We count binary queries
this time.) Consider a quantum algorithm that makes at most $T$ queries and that, for every $X$,
outputs either the correct output $f(X)$ or, with probability $\le 1/2$, outputs "inconclusive". We use
the following lemma from [?]:

Lemma A.1 The probability that a $T$-query quantum algorithm outputs 1 can be written as a
multilinear $N$-variate polynomial $P(X)$ of degree at most $2T$.

Consider the polynomial $P$ induced by our $T$-query algorithm for $f$. It has the following
properties:

1. $P$ has degree $d \le 2T$

2. if $f(X) = 0$ then $P(X) = 0$

3. if $f(X) = 1$ then $P(X) \in [1/2, 1]$

We first show that only very few inputs $X \in \{0,1\}^N$ make $f(X) = 1$. The number of such 1-inputs
for $f$ is at most the number of ways to choose $k \in \{0,1\}^n - \{0^n\}$, times the number of ways to choose
$2^n/2$ independent $x_i \in \{0,1\}^n$, which is $(2^n - 1) \cdot (2^n)^{2^n/2} < 2^{n(2^n/2+1)}$. Accordingly,
the fraction of 1-inputs among all $2^N$ inputs $X$ is $< 2^{n(2^n/2+1)}/2^{n2^n} = 2^{-n(2^n/2-1)}$. These $X$
are exactly the $X$ that make $P(X) \ne 0$. However, the following result is known [?, 22]:

Lemma A.2 (Schwartz) If $P$ is a non-constant $N$-variate multilinear polynomial of degree $d$,
then

$$\frac{|\{X \in \{0,1\}^N \mid P(X) \ne 0\}|}{2^N} \ge 2^{-d}.$$

This implies $d \ge n(2^n/2 - 1)$ and hence $T \ge d/2 \ge n(2^n/4 - 1/2) \approx N/4$. Thus we have proved
that the worst-case zero-error quantum complexity of $f$ is near-maximal:

Theorem A.3 $Q_0(f) \in \Omega(N)$.
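The counting step of this appendix can be verified by exhaustive enumeration for very small $n$. The sketch below is an illustration only (it enumerates all $(2^n)^{2^n}$ inputs, so it is feasible for $n \le 2$); it uses the $\{0,1\}^n$-valued representation with each $x_i$ stored as an integer, and the function names are ours.

```python
from itertools import product

def f(xs):
    """f(X) = 1 iff some non-zero k has xs[i ^ k] == xs[i] for all indices i."""
    size = len(xs)
    return int(any(all(xs[i ^ k] == xs[i] for i in range(size))
                   for k in range(1, size)))

def count_one_inputs(n):
    """Exact number of inputs X = (x_0, ..., x_{2^n - 1}) with f(X) = 1."""
    size = 2 ** n
    return sum(f(xs) for xs in product(range(size), repeat=size))

def counting_bound(n):
    """The bound (2^n - 1) * (2^n)^(2^n / 2) used in the appendix."""
    return (2 ** n - 1) * (2 ** n) ** (2 ** n // 2)
```

For $n = 2$ there are 40 inputs with $f(X) = 1$ out of 256, below the bound of 48 (the bound counts an input once for every $k$ that works for it), and the fraction $40/256$ is below $2^{-n(2^n/2 - 1)} = 1/4$.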